Reliability Design for Large Scale Data Warehouses

Authors

  • Kai Du
  • Zhengbing Hu
  • Huaimin Wang
  • Yingwen Chen
  • Shuqiang Yang
  • Zhijian Yuan
Abstract

Data reliability has drawn much concern in large-scale data warehouses holding 1 PB or more of data. It depends heavily on many interdependent system parameters, such as the replica placement policy, the number of nodes, and so on. Previous work has discussed the individual impacts of these parameters only roughly and separately, has seldom provided their optimal values, and has not addressed their optimal combination. In this paper, we present a new object-based-repairing Markov model. By analyzing this model under three popular replica placement policies, we first determine the individual optimal values of these parameters and then work out their optimal combination using a genetic algorithm (GA). Compared with existing models, our model is easier to solve while reaching more integrated and practical conclusions. These conclusions can effectively guide designers in building more reliable large-scale data warehouses.

Similar resources

THE OPTIMIZATION OF LARGE-SCALE DOME TRUSSES ON THE BASIS OF THE PROBABILITY OF FAILURE

Many researchers prefer metaheuristic algorithms for the reliability-based design optimization (RBDO) of truss structures. The cross-sectional areas of the truss elements are taken as design variables for size optimization under frequency constraints. The design of dome truss structures is optimized based on reliability by a popular metaheuristic optimization tec...

Solving a mathematical model with multi warehouses and retailers in distribution network by a simulated annealing algorithm

Determining shipment quantities and solving the distribution problem are important subjects in today's business. This paper describes an inventory/distribution network design. The system addresses a class of distribution network design problems characterized by multiple product families, multiple warehouses, and multiple retailers. The maximum capacities of vehicles and warehouses are also known. The res...

A conic quadratic model for supply chain network design under hub, capacity, delay and lost sale

In this paper, mathematical models are proposed for simultaneously modeling location and inventory control decisions in a four-echelon supply chain network with capacity constraints. The echelons considered are suppliers, warehouses, hubs, and retailers. The aim of the model is to minimize location, transportation, and inventory control costs. Hence, a non-linear mixed integer pr...

Issues in Developing Very Large Data Warehouses

The size of The Boeing Company imposes some stringent requirements on data warehouse design and implementation. We summarize four interesting and challenging issues in developing very large-scale data warehouses, namely failure recovery, incremental update maintenance, cost models for schema design and query optimization, and metadata definition and management. For each issue, we give the reasons ...

A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses

Conventional data warehouses employ the query-at-a-time model, which maps each query to a distinct physical plan. When several queries execute concurrently, this model introduces contention, because the physical plans—unaware of each other—compete for access to the underlying I/O and computation resources. As a result, while modern systems can efficiently optimize and evaluate a single complex ...

Journal title:
  • JCP

Volume 3  Issue 

Pages  -

Publication date 2008